Michael Qizhe Shieh

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

Apr 26, 2026
Via arXiv

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

Apr 22, 2026
Via arXiv

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

Apr 06, 2026
Via arXiv

Gym-V: A Unified Vision Environment System for Agentic Vision Research

Mar 17, 2026
Via arXiv

In-Context Reinforcement Learning for Tool Use in Large Language Models

Mar 09, 2026
Via arXiv

ImageEdit-R1: Boosting Multi-Agent Image Editing via Reinforcement Learning

Mar 09, 2026
Via arXiv

LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards

Mar 02, 2026
Via arXiv

Gradually Compacting Large Language Models for Reasoning Like a Boiling Frog

Feb 04, 2026
Via arXiv

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Oct 30, 2025
Via arXiv

The Emergence of Abstract Thought in Large Language Models Beyond Any Language

Jun 11, 2025
Via arXiv